Towards a Universal Sentiment Classifier in Multiple languages
Abstract
Existing sentiment classifiers usually work for only one language, and a separate classification model is built for each language. In this paper, we aim to build a universal sentiment classifier: a single classification model that works across multiple languages. To achieve this, we propose to learn multilingual sentiment-aware word embeddings jointly, using only labeled reviews in English and unlabeled parallel data available for a few language pairs. Parallel data need not exist between English and each of the other languages, because sentiment information can be transferred into any language via pivot languages. We evaluate the universal sentiment classifier in five languages, and the results are very promising even when parallel data between English and the target languages is not used. Furthermore, we compare the single universal classifier with several cross-language sentiment classifiers that rely on direct parallel data between the source and target languages; across multiple target languages, its performance is very promising in comparison.
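The core idea of the abstract, one classifier operating on a shared multilingual embedding space, can be illustrated with a toy sketch. This is not the paper's training procedure: the two-dimensional vectors in `EMB` are hand-made and hypothetical (the paper learns its embeddings from labeled English reviews and parallel data), Spanish is an arbitrary choice of second language, and the averaged-embedding logistic regression is an assumed stand-in for the actual model. The sketch only shows why a model trained on English labels alone can score text in another language once words from both languages share one space.

```python
import math

# Hypothetical, hand-made embeddings in ONE shared 2-D space.
# Axis 0 loosely encodes sentiment; axis 1 encodes "topic-ness".
# In the paper such vectors are learned; here they are assumed.
EMB = {
    # English
    "good": (0.9, 0.1), "great": (1.0, 0.2),
    "bad": (-0.9, 0.1), "awful": (-1.0, 0.2),
    "movie": (0.0, 0.8), "book": (0.0, 0.9),
    # Spanish (placed near their English counterparts)
    "bueno": (0.85, 0.1), "malo": (-0.85, 0.15),
    "pelicula": (0.0, 0.85), "libro": (0.05, 0.9),
}


def doc_vector(tokens):
    """Average the embeddings of all known tokens in a document."""
    vecs = [EMB[t] for t in tokens if t in EMB]
    if not vecs:
        return (0.0, 0.0)
    return tuple(sum(v[i] for v in vecs) / len(vecs) for i in range(2))


def train(docs, labels, epochs=200, lr=0.5):
    """Fit a logistic regression with plain stochastic gradient descent."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for tokens, y in zip(docs, labels):
            x = doc_vector(tokens)
            p = 1.0 / (1.0 + math.exp(-(w[0] * x[0] + w[1] * x[1] + b)))
            g = p - y  # gradient of the log loss w.r.t. the logit
            w[0] -= lr * g * x[0]
            w[1] -= lr * g * x[1]
            b -= lr * g
    return w, b


def predict(model, tokens):
    """Return 1 for positive sentiment, 0 for negative."""
    w, b = model
    x = doc_vector(tokens)
    return 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else 0


# Labeled data exists in English only.
english_docs = [["good", "movie"], ["great", "book"],
                ["bad", "movie"], ["awful", "book"]]
english_labels = [1, 1, 0, 0]
model = train(english_docs, english_labels)

# The same single model is applied, unchanged, to Spanish input.
print(predict(model, ["libro", "bueno"]))
print(predict(model, ["pelicula", "malo"]))
```

Because the classifier only ever sees vectors, never words, it does not need to know which language a document came from; the quality of the cross-lingual transfer then depends entirely on how well the embedding space aligns the languages.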
Similar resources
A Semi-Supervised Framework Based on a Self-Constructed Adaptive Lexicon for Persian Opinion Analysis
With the appearance of Web 2.0 and 3.0, users' contributions to the WWW have created a huge amount of valuable expressed opinions. Given the difficulty or impossibility of manually analyzing such big data, sentiment analysis, as a branch of natural language processing, has received considerable attention. Unlike for other (popular) languages, a limited number of research studies have been conducted in ...
Towards Cross-Language Sentiment Analysis through Universal Star Ratings
The abundance of sentiment-carrying user-generated content renders automated cross-language information monitoring tools crucial for today’s businesses. In order to facilitate cross-language sentiment analysis, we propose to compare the sentiment conveyed by unstructured text across languages through universal star ratings for intended sentiment. We demonstrate that the way natural language rev...
Language-Independent Twitter Sentiment Analysis
Millions of tweets posted daily contain opinions and sentiment of users in a variety of languages. Sentiment classification can benefit companies by providing data for analyzing customer feedback for products or conducting market research. Sentiment classifiers need to be able to handle tweets in multiple languages to cover a larger portion of the available tweets. Traditional classifiers are h...
Lexicon-based sentiment analysis by mapping conveyed sentiment to intended sentiment
As consumers nowadays generate increasingly more content describing their experiences with, e.g., products and brands in various languages, information systems monitoring a universal, language-independent measure of people's intended sentiment are crucial for today's businesses. In order to facilitate sentiment analysis of user-generated content, we propose to map sentiment conveyed by unstructu...
Multilingual Summarisation and Sentiment Analysis
Summarisation and sentiment analysis are key NLP technologies that allow monitoring of evolving content and opinions in the huge amounts of textual data available on the web. Summarisation can address the problem of information overload by extracting and presenting the main content, and sentiment analysis can identify opinions expressed towards entities or events. Because there can be found so ma...
Publication date: 2017